Discovering Periodic Signals in Images With Topological Data Analysis

Nicholas Carrara and Justin Curry

University at Albany

Notes for Nick:

  • We want to start with a visual demonstration of what traditional TDA methods has done and what we hope to do.
  • Please steal the jumping jacks video from Chris Tralie's TDA labs on one slide
  • Please insert video of GoL simulation with spatially separated periodic features, e.g. gliders, etc.
  • Then we'll have an table of contents slide.
In [14]:
# consider the following video of a person doing jumping jacks, which clearly has some
# global periodic behavior.
CA.display_CA('jumpingjacks.ogg')
Out[14]:

Sliding Windows Embedding

C. Tralie's sliding windows embedding technique is a tool for finding periodic signals in time series data by applying persistence to embeddings of the signal.

In [15]:
CA.display_CA('jumpingjackssliding.ogg')
Out[15]:

Periodic structures in Cellular Automata

Cellular Automata is a well studied phenomenon that we can use as a benchmark for testing our tools.

In [16]:
CA.display_CA('movie2.mp4')
Out[16]:

Introduction to Persistent Homology

Persistent Homology is a technique for describing the Homology of a sample space by making various assumptions about the structure of the underlying manifold. One does this by choosing a filtration over the sample space and constructing its associated $n$-complexes. The homology of the complexes is evaluated at each level of the filtration. The homological features which happen to persist are then considered to be relevant.

title

The topological features that are created and destroyed by the filtration are represented by a barcode diagram,

title

In [22]:
# demonstration of persistence on a noisy circle
display(widgets.HBox((epsilonSlider, noiseSlider, pointsSlider)))
plt.figure(figsize=(9,5))
generate_PD()

Cellular Automata

A cellular automata is an array of cells with an associated state $a \in \mathcal{CA}$ and a rule, $f:\mathcal{CA}\rightarrow\mathcal{CA}$, for updating the state,

title

Example of one-dimensional rule for a binary state automaton. Each time an update is performed, $f(a)$, the rule here is only a function of the nearest neighbors of each cell.

The space of possible rules for a given $\mathcal{CA}$ can be arbitrarily large depending on the number of possible states and the parameters of $f$.

title

Here are some examples of one-dimensional Automata which are generated by the initial condition of one live cell in the middle,

Drawing Drawing Drawing Drawing Drawing Drawing Drawing Drawing Drawing Drawing Drawing Drawing
In [23]:
# The following code generates one-dimensional CA with the following settings,

def on_value_change(change):
    generate_CA()
    
ruleSlider = widgets.IntSlider(min=0,max=255,value=18,description='Rule:',continuous_update=False)
ruleSlider.observe(on_value_change, names='value')

randomSlider = widgets.FloatSlider(min=0.0,max=1,step=0.01,value=0,description='Percent Random: ',continuous_update=False)
randomSlider.observe(on_value_change, names='value')

widthSlider = widgets.IntSlider(min=1,max=500,step=1,value=50,description='Width: ',continuous_update=False)
widthSlider.observe(on_value_change, names='value')

heightSlider = widgets.IntSlider(min=1,max=500,step=1,value=50,description='Height: ',continuous_update=False)
heightSlider.observe(on_value_change, names='value')




def generate_CA():    
    plt.clf()
    initState = [[0 for i in range(widthSlider.value)]]
    
    
    # Get slider values
    rule = ruleSlider.value
    if randomSlider.value == 0:
        initState[0][int(widthSlider.value/2)] = 1
    else:
        randomState = [i for i in range(widthSlider.value)]
        random.shuffle(randomState)
        for j in range(int(widthSlider.value*randomSlider.value)):
            initState[0][randomState[j]] = 1
    width = widthSlider.value
    height = heightSlider.value
    
    ca = CA.Automata(1, width, 2, 0, rule, 0, .5, height)
    ca.initializeOneDimensionalEmptyCells()
    ca.setCells(initState)
    sequence = ca.generateOneDimensionalSequence()    
    
    #Step 4: Plot 
    plt.imshow(sequence)
    plt.tight_layout()
In [24]:
# setup display
display(widgets.HBox((ruleSlider, randomSlider, widthSlider, heightSlider)))
plt.figure()
generate_CA()

Wolfram's Universality Classes

The four basic classes that Wolfram assigned to each rule were; Stable, Periodic, Chaotic and Complex.

title title

We would like to explore the topological properties of rule space, starting with this particular special case. We'll be looking at several methods centered around persistent homology.

In [26]:
# setup display
display(widgets.HBox((ruleSlider, randomSlider, widthSlider, heightSlider)))
plt.figure(figsize=(9,5))
generate_CA_TDA()

Persistent Homology Dimension

One topological descripter is the persistent homology dimension defined in the paper by MacPherson and Schweinhart: https://arxiv.org/abs/1011.2258, which does the following transformation,

title

PH Dimension is a global description of the persistent homology of a data set and is meant to capture the self-similarity of the space. In some cases P.H. Dimension coincides with Hausdorff dimension, as in the case of the Serpinski gasket $d = -\log 3/ \log \rho$,

title

Question:

Can we use P.H. dimension and self-similarity to classify the rule space of Cellular Automata?

Cubical Ripser

To compute the persistent homology of a cellular automata, we used a variant of the RIPSER algorithm applied to cubical toplexes: https://github.com/infophysics/CubicalRipser2D. Connectedness if determined by the von Neumann neighborhood; i.e.

title

Here are some examples showing the Persistent Homology Dimension for various rules,

title title

Rule 60 should give a P.H. Dimension close to the Serpinski gakset $(\sim \log(3))$, however we get the value $d = 2.52$. We compared the result using the standard RIPSER algorithm by constructing a point cloud from the $\mathcal{CA}$,

title title

At this point the results are still unclear, and the size of the automata seems to have an affect on the stability of the P.H. dimension,

title title

To compare with other methods for computing fractal dimension, we used a simple implementation of the box-counting method, which does far better at estimating the fractal dimension as is also incredibly stable. Below is a plot comparing the $L^1$ filtration PH Dimension of Rule 60 with the box-counting dimension,

title

Multi Dimensional Scaling

Multidimensional scaling (MDS) was used to explore whether the collection of 1DCA resulting from each of the 256 rules may have a lower dimensional representation.

MDS attempts to represent data in a lower dimensional space while preserving distances between individual data points.

It can be treated as an optimization problem: \begin{equation*} \begin{aligned} & \underset{x_1,\dots ,x_n}{\text{min}} & & \sum_{i<j}(\|x_i-x_j\|-\delta_{i,j}) \end{aligned} \end{equation*} where each $x_i$ low dimensional vector representing the ith data point, and $\delta_{i,j}$ is the $i,j$ entry of the distance matrix of the data.

One Cell Initialization

title

title

The dataset was generated from a 101 cell array randomly initialized with approximately 50% of cells in state 1. Each of the 256 rules was applied over 100 iterations to create an image. An example is shown below:

title

To prepare the data for MDS, images were flattened into (101$\times$100) dimensional vectors. Two compontent MDS was applied.

Random Initial Conditions

title

title

title

title

Remarks:

Under randomized initial conditions, the Class 1 rules were grouped into two antipodal clusters, corresponding to a steady state of all cells in state 1 or all cells in state 2.

Class 3 and 4 mostly clustered together, with Class 3 occupying the central part of each cluster, and Class 4 occupying the exterior.

One MDS coordinate seemed to correspond to the net concentration of state 1 or state 0 cells in an image.

Sliding Windows Embedding

We used 1D persistent homology to measure the geometric existence of loops.

title

A periodicity score may be obtained from a sliding windows embedding by selecting the maximum $H_1$ persistence bar length over a range of window sizes.

This periodicity score can be combined with the two MDS components shown previously to give a 3 dimensional representation of 1DCA rules:

In [27]:
# some 1D rules MDS
CA.display_CA('Plots/3D_Plot.mp4')
Out[27]:
In [29]:
# Here is the same projection with only class 3 and 4
CA.display_CA('Plots/3d_Plot_Class3_and_4.mp4')
Out[29]:

Predictions

Predictions without MDS
Actual Class I Predict II Predict III Predict IV Predict Accuracy
I 24 0 0 0 1
II 25 169 0 0 .871
III 1 16 15 0 .454
IV 0 5 1 0 0.0
Predictions with MDS
Actual Class I Predict II Predict III Predict IV Predict Accuracy
I 23 1 0 0 .958
II 0 187 5 2 .964
III 0 7 25 0 .781
IV 0 2 1 3 0.5

Conway's Game of Life

Typical studies of two-dimensional Automata are that of binary cells with "totality" rules. Totality rules ignore directionality of nearest neighbors and instead update only with respect to the number of neighbors that are alive in their vicinity.

Drawing Drawing

To generate a CA like the game of life, we will use the rulestring B/S/n convention. From Wikipedia;

"Generations rules are described by rulestrings of the form Bx/Sy/n[note 2] or y/x/n, an extension of B/S and S/B rulestrings where n is any natural number ≥2. Patterns evolve according to the following rules:

A cell in state 0 ("dead") will advance to state 1 ("get born") in the next generation of the pattern 
if the number of neighbors in state 1 ("live") in its Moore neighbourhood is present in the rule's 
birth conditions (Bx).

A cell in state 1 ("live") will:
    Remain in state 1 ("survive") in the next generation of the pattern if the number of neighbors 
    in state 1 ("live") in its Moore neighborhood is present in the rule's survival conditions (Sx).
    Advance to state 2 ("age") in the next generation of the pattern otherwise.

A cell in state m ≥ 2 will advance to state ((m + 1) mod n) in the next generation of the pattern. 
In particular, a cell in state n will reset to state 0 ("die").

Any outer-totalistic Life-like cellular automaton with rulestring B.../S... is equivalent to the Generations rule with rulestring B.../S.../2."

So, for Conways rules the rulestring will be - "B3/S23/2". If we want to generate a CA using Conways rules, we would instantiate the following object;

In [30]:
# generate a CA with conways rules, 50x50 cells, toroidal topology, a .37 density and 30 time steps.
ca = Conway("B3/S23/2", 50, 1, .37, 30)

# to see how the rule is converted to a number,
print("Number conversion -",ca.getRule())  # 6152
Number conversion - 6152

From here we can generate a random CA sequence using these rules

In [31]:
# initialize cells with density of .37
ca.initializeTwoDimensionalCells()

# generate a sequence using the specified time of 30
sequence = ca.generateTwoDimensionalSequence()

To save the sequence, use the save_CA function

In [32]:
CA.save_CA(sequence, 'movie.mp4')

Then we can display it using the display_CA function

In [33]:
CA.display_CA('movie.mp4')
Out[33]:

APGcode Format

Using the apgcode format, we can imbed any CA object into the grid. From Wikipedia;

Non-oversized still lifes, oscillators and patterns are encoded in extended Wechsler format, an extension of a pattern notation developed by Allan Wechsler in 1992.

The pattern is separated into horizontal strips of five rows.
    Each strip, n columns wide, is encoded as a string of n characters in the set {0, 1, 2, ..., 8, 9, a, b, ..., v} denoting the five cells in a vertical column corresponding to the bitstrings {'00000', '00001', '00010', ..., '01000', '01001', '01010', '01011', ... '11111'}.
    The characters 'w' and 'x' are used to abbreviate '00' and '000', respectively, and the symbols {'y0', 'y1', y2', ..., 'yx', 'yy', 'yz'} are used to encode runs of between 4 and 39 consecutive '0's.
    Extraneous '0's at the ends of strips are not included in the encoding. (Note that in particular, blank five-row strips are represented by the empty string.)
The character 'z' separates contiguous five-row strips.

Examples

xq4_27deee6 encodes a HWSS:

In [34]:
# generate an empty CA and place the space ship "xq4_27deee6"
ca = Conway("B3/S23/2", 30, 1, .37, 100)
ca.initializeTwoDimensionalEmptyCells()
ca.placeObject("xq4_27deee6", 10, 10, True)
CA.display_state(ca.getState())

Now we'll animate this CA

In [35]:
# generate an animation of this CA
sequence = ca.generateTwoDimensionalSequence()
CA.save_CA(sequence, 'movie2.mp4')
In [36]:
# display an animation of this CA
CA.display_CA('movie2.mp4')
Out[36]:

Let's try a larger CA

In [37]:
# set up a queen bee initialization
ca = Conway("B3/S23/2", 30, 1, .37, 100)
ca.initializeTwoDimensionalEmptyCells()
ca.placeObject("xp30_w33z8kqrqk8zzzx33", 5, 5, True)
CA.display_state(ca.getState())
In [38]:
# generate an animation of this CA
sequence = ca.generateTwoDimensionalSequence()
CA.save_CA(sequence, 'movie2.mp4')

Let's Display This CA

In [39]:
CA.display_CA('movie2.mp4')
Out[39]:

One interesting repeater is the Gosper gun

title

The Figure below shows an example of 1D persistence homology, the H1 curve is due to the presence of loops of the same class as result of the oscillators motion in over short time periods in the video. The two other points in the 1D persistence diagram correspond to the loops created by the periodicity of the glider guns motions, clearly these loops have much higher persistence than the oscillators loops.

Colors in the PCA embedding indicate the position of the window in time, with cool colors (greens and yellows) indicating earlier windows and hot colors (oranges and reds) indicating later windows.

The roundness in the PCA projection embedding corresponds to the closeness of the eigenvalues to each other, more close values results more circular shape in the PCA projection.

title

Multi-State Rules

Other rule spaces in two-dimensional Automata can have interesting self-replicating behavior, such as $B3/S23/11$, which is Conway's Rules for an 11 state Automata. The following example is a "reflectionless rotating oscillar", for which we can run sliding windows embedding to determine it's periodic behavior. This rotator has rule code - xp32_ss2gzw11_wsfm76cg_x86668_wke2024_ws6i1_x1

In [40]:
CA.display_CA('Plots/refl_rotator.mp4')
Out[40]:

Sliding Window Embedding of the reflectorless rotating oscillator.

title

Oscillators in Conway's Game of Life

In cellular automata, an oscillator is any pattern of cells that will repeat itself in a later generation. There are hundreds of known oscillators in Conway's Game of Life.

Sliding Windows Embedding can also be used to measure periodic behavior in these oscillators.

Goals:

1. Use SWE to identify an oscillator randomly selected from a list of known oscillators.

2. Find the maximum persistence of an oscillator across a range of window sizes, and compare to the natural period of the oscillator.

3. Use the maximum persistence value as a factor to predict whether a patch of cells contains an oscillating pattern.
In [41]:
CA.display_CA('Plots/osc.mp4')
Out[41]:

Below is a plot of the maximum H1 persistence value for various sliding window embedding window sizes of the above oscillator

title

In [42]:
# generate an noisy CA and place the space ship "xq4_27deee6"

ca = Conway("B3/S23/2", 30, 1, .05, 100)
ca.initializeTwoDimensionalCells()
ca.placeObject("xq4_27deee6", 10, 10, False)
CA.display_state(ca.getState())
In [43]:
# generate an animation of this CA
sequence = ca.generateTwoDimensionalSequence()
CA.save_CA(sequence, 'movie3.mp4')
In [44]:
CA.display_CA('movie3.mp4')
Out[44]:
In [45]:
# Below is a large CA with lots of noise introduced.

CA.display_CA('Plots/noise.mp4')
Out[45]:

And the corresponding max persistence for varying window size,

title

Future Research

  1. Understand the subtleties with PH Dimension with respect to cubical complexes.

  2. Understand whether topological properties generalize Wolfram's Universality classes.

  3. Understand the cause of the continuous curve formed in the Sliding Windows Embedding persitence diagram of the Glider Gun.

  4. Improve classification accuracy of oscillators in the presence of noise.